Framework
Model
Trainer
class eole.utils.Statistics(loss=0, auxloss=0, n_batchs=0, n_sents=0, n_tokens=0, n_correct=0, computed_metrics=None, data_stats=None, attention_entropy=0, n_attention_samples=0)
Bases: object
Accumulator for loss statistics. Currently calculates:
- accuracy
- perplexity
- elapsed time
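As a rough guide, these quantities typically follow the standard definitions below, derived from the accumulator fields in the constructor (a hedged sketch, not necessarily the exact eole implementation):

```python
import math

# Hedged sketch: accuracy as the percentage of correctly predicted tokens,
# cross-entropy as accumulated loss per token, and perplexity as its exp
# (capped to avoid overflow on diverging losses).
def accuracy(n_correct, n_tokens):
    return 100.0 * n_correct / n_tokens

def xent(loss, n_tokens):
    return loss / n_tokens

def ppl(loss, n_tokens):
    return math.exp(min(loss / n_tokens, 100))
```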
accuracy()
compute accuracy
static all_gather_stats(stat, max_size=4096)
Gather a Statistics object across multiple processes/nodes
- Parameters:
- stat (Statistics) – the statistics object to gather across all processes/nodes
- max_size (int) – max buffer size to use
- Returns: Statistics, the updated stats object
static all_gather_stats_list(stat_list, max_size=4096)
Gather a Statistics list across all processes/nodes
- Parameters:
- stat_list (list([Statistics])) – list of statistics objects to gather across all processes/nodes
- max_size (int) – max buffer size to use
- Returns: list of updated stats
- Return type: list([Statistics])
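A hedged sketch of how these gather helpers might be called in a multi-process training loop (assumes the training launcher has already initialized torch.distributed; the local statistics values are made up for illustration):

```python
import torch.distributed as dist
from eole.utils import Statistics

# Each rank accumulates its own statistics during a step (illustrative values).
local_stats = Statistics(loss=1.23, n_sents=32, n_tokens=1000, n_correct=700)

# Merge across ranks so every process reports the same aggregated numbers.
# Sketch only: assumes dist.init_process_group() was already called.
if dist.is_available() and dist.is_initialized():
    merged = Statistics.all_gather_stats(local_stats)
else:
    merged = local_stats
```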
avg_attention_entropy()
compute average attention entropy
computed_metric(metric)
check whether the given metric (TER/BLEU) has been computed and return it
elapsed_time()
compute elapsed time
log_tensorboard(prefix, writer, learning_rate, patience, step)
write statistics to TensorBoard
output(step, num_steps, learning_rate, start)
Write out statistics to stdout.
- Parameters:
- step (int) – current step
- num_steps (int) – total steps
- learning_rate (float) – current learning rate
- start – start time of the step.
ppl()
compute perplexity
update(stat, update_n_src_tokens=False)
Update statistics by summing values with another Statistics object
- Parameters:
- stat – another Statistics object
- update_n_src_tokens (bool) – whether to update (sum) n_src_tokens or not
xent()
compute cross entropy
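A minimal usage sketch of the accumulator in a training loop (all field values below are made up for illustration):

```python
from eole.utils import Statistics

report_stats = Statistics()

for step in range(1, 101):
    # In practice these numbers come from the loss computation for one batch.
    batch_stats = Statistics(loss=2.0, n_batchs=1, n_sents=32,
                             n_tokens=900, n_correct=630)
    report_stats.update(batch_stats)

    if step % 50 == 0:
        print(f"step {step}: acc {report_stats.accuracy():.2f} "
              f"ppl {report_stats.ppl():.2f} xent {report_stats.xent():.4f}")
        report_stats = Statistics()  # start a fresh reporting window
```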
Loss
Optimizer
class eole.utils.Optimizer(optimizer, learning_rate, learning_rate_decay_fn=None, max_grad_norm=None, use_amp=True)
Bases: object
Controller class for optimization. Mostly a thin wrapper for optim, but also useful for implementing rate scheduling beyond what is currently available. Also implements necessary methods for training RNNs, such as gradient manipulation.
- Parameters:
- optimizer – A torch.optim.Optimizer instance.
- learning_rate – The initial learning rate.
- learning_rate_decay_fn – An optional callable taking the current step as argument and returning a learning rate scaling factor.
- max_grad_norm – Clip gradients to this global norm.
property amp
True if torch AMP mixed-precision training is used.
backward(loss)
Wrapper for the backward pass. Some optimizers require ownership of the backward pass.
classmethod from_config(model, config, checkpoint=None)
Builds the optimizer from options.
- Parameters:
- cls – The Optimizer class to instantiate.
- model – The model to optimize.
- config – The user configuration.
- checkpoint – An optional checkpoint to load states from.
- Returns: An Optimizer instance.
learning_rate(step=None)
Returns the current learning rate.
step()
Update the model parameters based on current gradients.
Optionally applies gradient modification or updates the learning rate.
property training_step
The current training step.
zero_grad(set_to_none=True)
Zero the gradients of optimized parameters.
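A hedged usage sketch of the controller wrapping a plain torch optimizer (the stand-in model, decay function, and hyperparameter values are illustrative assumptions, not eole defaults):

```python
import torch
from eole.utils import Optimizer

model = torch.nn.Linear(512, 512)                      # stand-in model
base_optim = torch.optim.Adam(model.parameters(), lr=1.0)

# Hypothetical noam-style schedule: returns a scaling factor for the step.
def noam_decay(step, warmup=4000, dim=512):
    step = max(step, 1)
    return dim ** -0.5 * min(step ** -0.5, step * warmup ** -1.5)

optim = Optimizer(base_optim, learning_rate=2.0,
                  learning_rate_decay_fn=noam_decay,
                  max_grad_norm=5.0, use_amp=False)

loss = model(torch.randn(8, 512)).pow(2).mean()
optim.backward(loss)     # wrapper around the backward pass (see backward())
optim.step()             # apply decayed learning rate and clipping, then update
optim.zero_grad()
print(optim.training_step, optim.learning_rate())
```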
class eole.utils.AdaFactor(params, lr=None, beta1=0.9, beta2=0.999, eps1=1e-30, eps2=0.001, cliping_threshold=1, non_constant_decay=True, enable_factorization=True, ams_grad=True, weight_decay=0)
Bases: Optimizer
step(closure=None)
Perform a single optimization step to update parameters.
- Parameters: closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
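AdaFactor takes model parameters plus hyperparameters like a standard torch optimizer, and its step() accepts an optional closure, so a minimal sketch might look like this (the stand-in model and hyperparameter values are illustrative assumptions):

```python
import torch
from eole.utils import AdaFactor

model = torch.nn.Linear(256, 256)                      # stand-in model
optimizer = AdaFactor(model.parameters(), lr=1e-3,
                      enable_factorization=True, weight_decay=0.0)

loss = model(torch.randn(4, 256)).pow(2).mean()
loss.backward()
optimizer.step()          # single optimization step, as documented above
optimizer.zero_grad()
```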